

Evaluating Large Language Models for IUCN Red List Species Information

Uryu, Shinya

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are rapidly being adopted in conservation to address the biodiversity crisis, yet their reliability for species evaluation is uncertain. This study systematically validates five leading models on 21,955 species across four core IUCN Red List assessment components: taxonomy, conservation status, distribution, and threats. A critical paradox was revealed: models excelled at taxonomic classification (94.9%) but consistently failed at conservation reasoning (27.2% for status assessment). This knowledge-reasoning gap, evident across all models, suggests inherent architectural constraints, not just data limitations. Furthermore, models exhibited systematic biases favoring charismatic vertebrates, potentially amplifying existing conservation inequities. These findings delineate clear boundaries for responsible LLM deployment: they are powerful tools for information retrieval but require human oversight for judgment-based decisions. A hybrid approach is recommended, where LLMs augment expert capacity while human experts retain sole authority over risk assessment and policy.
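The reported figures (94.9% for taxonomy, 27.2% for status) are per-component agreement rates. A minimal sketch of how such per-component scoring might look, using hypothetical species records and labels — this is illustrative, not the paper's actual evaluation pipeline:

```python
# Hypothetical sketch: scoring model outputs against IUCN Red List reference
# labels, one accuracy figure per assessment component. The species, labels,
# and schema here are illustrative placeholders.

reference = {
    "Diomedea exulans": {"taxonomy": "Aves", "status": "VU"},
    "Arctocephalus tropicalis": {"taxonomy": "Mammalia", "status": "LC"},
}

model_output = {
    "Diomedea exulans": {"taxonomy": "Aves", "status": "EN"},   # wrong status
    "Arctocephalus tropicalis": {"taxonomy": "Mammalia", "status": "LC"},
}

def component_accuracy(reference, predicted, component):
    """Fraction of species whose predicted value matches the reference."""
    hits = sum(
        predicted[s][component] == reference[s][component] for s in reference
    )
    return hits / len(reference)

print(component_accuracy(reference, model_output, "taxonomy"))  # 1.0
print(component_accuracy(reference, model_output, "status"))    # 0.5
```

Run at scale over the four components, a gap between the taxonomy score and the status score is exactly the knowledge–reasoning gap the abstract describes.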


From Proxies to Fields: Spatiotemporal Reconstruction of Global Radiation from Sparse Sensor Sequences

Kobayashi, Kazuma, Roy, Samrendra, Koric, Seid, Abueidda, Diab, Alam, Syed Bahauddin

arXiv.org Artificial Intelligence

Accurate reconstruction of latent environmental fields from sparse and indirect observations is a foundational challenge across scientific domains, from atmospheric science and geophysics to public health and aerospace safety. Traditional approaches rely on physics-based simulators or dense sensor networks, both constrained by high computational cost, latency, or limited spatial coverage. We present the Temporal Radiation Operator Network (TRON), a spatiotemporal neural operator architecture designed to infer continuous global scalar fields from sequences of sparse, non-uniform proxy measurements. Unlike recent forecasting models that operate on dense, gridded inputs to predict future states, TRON addresses a more ill-posed inverse problem: reconstructing the current global field from sparse, temporally evolving sensor sequences, without access to future observations or dense labels. Demonstrated on global cosmic radiation dose reconstruction, TRON is trained on 22 years of simulation data and generalizes across 65,341 spatial locations, 8,400 days, and sequence lengths from 7 to 90 days. It achieves sub-second inference with relative L2 errors below 0.1%, a speedup of more than 58,000x over Monte Carlo-based estimators. Though evaluated in the context of cosmic radiation, TRON offers a domain-agnostic framework for scientific field reconstruction from sparse data, with applications in atmospheric modeling, geophysical hazard monitoring, and real-time environmental risk forecasting.
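The headline accuracy figure is a relative L2 error. A quick sketch of that metric on synthetic placeholder fields (not TRON outputs), assuming the usual definition of relative L2 error as the norm of the residual divided by the norm of the reference:

```python
import numpy as np

# rel_l2 = ||pred - true||_2 / ||true||_2
# The fields below are synthetic stand-ins: one scalar value per spatial
# location, matching the 65,341-location grid size quoted in the abstract.

rng = np.random.default_rng(0)
true_field = rng.standard_normal(65_341)
pred_field = true_field + 1e-4 * rng.standard_normal(65_341)  # small error

rel_l2 = np.linalg.norm(pred_field - true_field) / np.linalg.norm(true_field)
print(rel_l2 < 1e-3)  # an error "below 0.1%" means rel_l2 < 1e-3
```

The same scalar is what a Monte Carlo baseline would be scored on, so the >58,000x speedup claim is at matched (sub-0.1%) error, not at degraded accuracy.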


MIRAI: Evaluating LLM Agents for Event Forecasting

Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information and reason over it to solve complex problems. Given this capability, increasing interest has been put into employing LLM agents for predicting international events, which can influence decision-making and shape policy development on an international scale. Despite this growing interest, a rigorous benchmark of LLM agents' forecasting capability and reliability is lacking. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs that enable LLM agents to use different tools via a code-based interface. In summary, MIRAI comprehensively evaluates agents' capabilities in three dimensions: 1) autonomously sourcing and integrating critical information from large global databases; 2) writing code with domain-specific APIs and libraries for tool use; and 3) jointly reasoning over historical knowledge in diverse formats and across time to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relations analysis.
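A code-based tool interface of the kind described might let an agent query structured event history before forecasting. The sketch below is purely illustrative — the function name, event schema, and data are hypothetical, not MIRAI's actual API over the cleaned GDELT database:

```python
from datetime import date

# Hypothetical in-memory event store standing in for a curated event database.
EVENT_DB = [
    {"date": date(2023, 11, 1), "source": "USA", "target": "CHN", "relation": "consult"},
    {"date": date(2023, 11, 3), "source": "CHN", "target": "USA", "relation": "consult"},
    {"date": date(2023, 11, 5), "source": "USA", "target": "CHN", "relation": "sanction"},
]

def get_events(source, target, start, end):
    """Return structured events from one actor to another in a date window."""
    return [
        e for e in EVENT_DB
        if e["source"] == source and e["target"] == target
        and start <= e["date"] <= end
    ]

# An agent's generated code might pull recent dyadic history like this,
# then reason over the returned relations to predict the next event.
history = get_events("USA", "CHN", date(2023, 11, 1), date(2023, 11, 30))
print([e["relation"] for e in history])  # ['consult', 'sanction']
```

The benchmark's three dimensions map onto this loop: sourcing (the query), tool use (writing the query code), and reasoning (turning the returned history into a forecast).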



Knowledge Graph Question Answering via SPARQL Silhouette Generation

Purkayastha, Sukannya, Dana, Saswati, Garg, Dinesh, Khandelwal, Dinesh, Bhargav, G P Shrivatsa

arXiv.org Artificial Intelligence

Knowledge Graph Question Answering (KGQA) has become a prominent area in natural language processing due to the emergence of large-scale Knowledge Graphs (KGs). Recently, Neural Machine Translation based approaches that translate natural language queries into structured query languages have gained momentum as a way to solve the KGQA task. However, most of these methods struggle with out-of-vocabulary words, where test entities and relations are not seen during training. In this work, we propose a modular two-stage neural architecture for the KGQA task. The first stage generates a sketch of the target SPARQL, called a SPARQL silhouette, for the input question. It comprises (1) a noise simulator that handles out-of-vocabulary words and reduces vocabulary size, and (2) a seq2seq model for text-to-SPARQL-silhouette generation. The second stage is a Neural Graph Search Module: the SPARQL silhouette generated in the first stage is distilled by substituting the precise relation into the predicted structure. We simulate ideal and realistic scenarios by designing a noise simulator. Experimental results show that the quality of the generated SPARQL silhouette in the first stage is outstanding for the ideal scenario, but for realistic scenarios (i.e., a noisy linker) it drops drastically. However, our neural graph search module recovers it considerably. We show that our method achieves reasonable performance, improving the state of the art by a margin of 3.72% F1 on the LC-QuAD-1 dataset. We believe our proposed approach is novel and will lead to dynamic KGQA solutions suited for practical applications.
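The second-stage idea — a silhouette with a placeholder where the precise KG relation belongs, filled in by a search over candidate relations — can be sketched minimally. The toy lexical scorer below is a stand-in for the paper's neural graph search module, and the query, entity, and relation names are illustrative:

```python
# A SPARQL silhouette: structure is predicted, but the relation is a
# placeholder token (<REL>) to be resolved against the KG vocabulary.
silhouette = "SELECT ?x WHERE { dbr:France <REL> ?x }"
candidate_relations = ["dbo:birthPlace", "dbo:capital", "dbo:populationTotal"]

def score(question_tokens, relation):
    """Toy stand-in for neural graph search: count question tokens that
    appear in the relation's local name."""
    label = relation.split(":")[-1].lower()
    return sum(tok in label for tok in question_tokens)

question = ["what", "is", "the", "capital", "of", "france"]
best = max(candidate_relations, key=lambda r: score(question, r))
query = silhouette.replace("<REL>", best)
print(query)  # SELECT ?x WHERE { dbr:France dbo:capital ?x }
```

The modularity matters: a noisy first-stage linker can leave `<REL>` wrong or unresolved, and the second stage recovers by re-scoring against the actual KG relations rather than the seq2seq vocabulary.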


Python Computer Vision Course

#artificialintelligence

Learn Computer Vision. An introductory course on Computer Vision with Python. Want to make Computer Vision apps? Learn Computer Vision theory? Build a strong portfolio with Computer Vision and Image Processing projects? Looking to add Computer Vision algorithms to your current software project? Whatever your motivation to learn Computer Vision, I can assure you that you've come to the right course. You get a complete course with 1 hour of video tutorials, plus source code for all examples in the course. What you'll learn: use basic Computer Vision techniques, do image processing, and build an Image Similarity app, a Face Detection app, and an Object Detection app. Master Computer Vision!


Scientists turn ALBATROSSES into surveillance drones to help track illegal fishing boats

Daily Mail - Science & tech

A team of researchers from the University of La Rochelle in France has converted albatrosses into de facto surveillance drones as part of a project to gather data on illegal fishing boats in the South Pacific and Indian Ocean. The team traveled to popular albatross nesting locations on Amsterdam Island and Kerguelen Island in the Indian Ocean north of Antarctica, and attached small sensors to 169 albatrosses in a procedure that took about 10 minutes per bird. The sensors weigh 65 grams, or around a seventh of a pound, and were equipped with a GPS receiver, a radar antenna, and a satellite communications monitor to track various boat communication systems. The devices were each powered by a small lithium battery that maintains a charge through a small solar panel, according to a report from Ars Technica. The albatrosses covered more than 18 million square miles between East Africa and New Zealand, gathering data from more than 600,000 GPS locations.


AI/ML Bootcamp

#artificialintelligence

By registering, you agree to the AWS Event Terms and Conditions and the AWS Community Codes of Conduct. By completing this form, I agree that I'd like to receive information from Amazon Web Services, Inc. and its affiliates related to AWS services, events and special offers, and my AWS needs by email and post. You may unsubscribe at any time by following the instructions in the communications received.


AI For Marketers: An Introduction and Primer, Second Edition

#artificialintelligence



bcr vidcast 107: AI governance, what are AI and ML, and the future is not here yet - Better Communication Results

#artificialintelligence

Vikram Mahidhar reminds us all that AI is only as good as the humans supervising and programming it. The biases and artefacts that come out of the processing reflect the biases programmed in at the beginning. A program trained to recognise totalled car bodies for insurance purposes, for example, will need close supervision of its decision-making outputs to maintain regulatory and consumer confidence in, and acceptance of, its decisions. There is a call for, and growth in, a new class of AI: one that is explainable and that builds trust by providing evidence. Vikram also reminds us that a governance strategy is key to engendering trust in our organisation, processes and people.